Search | BVS Regional Portal

1.

CATH 2024: CATH-AlphaFlow Doubles the Number of Structures in CATH and Reveals Nearly 200 New Folds.

Waman, Vaishali P; Bordin, Nicola; Alcraft, Rachel; Vickerstaff, Robert; Rauer, Clemens; Chan, Qian; Sillitoe, Ian; Yamamori, Hazuki; Orengo, Christine.

J Mol Biol ; : 168551, 2024 Mar 27.

Article in English | MEDLINE | ID: mdl-38548261

ABSTRACT

CATH (https://www.cathdb.info) classifies domain structures from experimental protein structures in the PDB and predicted structures in the AlphaFold Database (AFDB). To cope with the scale of the predicted data, a new NextFlow workflow (CATH-AlphaFlow) has been developed to classify high-quality domains into CATH superfamilies and identify novel fold groups and superfamilies. CATH-AlphaFlow uses a novel state-of-the-art structure-based domain boundary prediction method (ChainSaw) for identifying domains in multi-domain proteins. We applied CATH-AlphaFlow to process PDB structures not classified in CATH and AFDB structures from 21 model organisms, expanding CATH by over 100%. Domains not classified in existing CATH superfamilies or fold groups were used to seed novel folds, giving 103 new folds from PDB structures (September 2023 release) and 96 from AFDB structures of proteomes of 21 model organisms. Where possible, functional annotations were obtained using (i) predictions from publicly available methods and (ii) annotations from structural relatives in AFDB/UniProt50. We also predicted functional sites and highly conserved residues. Some folds are associated with important functions such as photosynthetic acclimation (in flowering plants), iron permease activity (in fungi) and post-natal spermatogenesis (in mice). CATH-AlphaFlow will allow us to identify many more CATH relatives in the AFDB, further characterising the protein structure landscape.

2.

KinFams: De-Novo Classification of Protein Kinases Using CATH Functional Units.

Adeyelu, Tolulope; Bordin, Nicola; Waman, Vaishali P; Sadlej, Marta; Sillitoe, Ian; Moya-Garcia, Aurelio A; Orengo, Christine A.

Biomolecules ; 13(2)2023 02 02.

Article in English | MEDLINE | ID: mdl-36830646

ABSTRACT

Protein kinases are important targets for treating human disorders, and they are the second most targeted families after G-protein coupled receptors. Several resources provide classification of kinases into evolutionary families (based on sequence homology); however, very few systematically classify functional families (FunFams) comprising evolutionary relatives that share similar functional properties. We have developed the FunFam-MARC (Multidomain ARchitecture-based Clustering) protocol, which uses multi-domain architectures of protein kinases and specificity-determining residues for functional family classification. FunFam-MARC predicts 2210 kinase functional families (KinFams), which have increased functional coherence, in terms of EC annotations, compared to the widely used KinBase classification. Our protocol provides a comprehensive classification for kinase sequences from >10,000 organisms. We associate human KinFams with diseases and drugs and identify 28 druggable human KinFams, i.e., enriched in clinically approved drugs. Since relatives in the same druggable KinFam tend to be structurally conserved, including the drug-binding site, these KinFams may be valuable for shortlisting therapeutic targets. Information on the human KinFams and associated 3D structures from AlphaFold2 is provided via our CATH FTP website and Zenodo. This gives the domain structure representative of each KinFam together with information on any drug compounds available. For 32% of the KinFams, we provide information on highly conserved residue sites that may be associated with specificity.

Subject(s)

Protein Kinases , Proteins , Humans , Protein Kinases/metabolism , Proteins/chemistry , Protein Databases , Amino Acid Sequence Homology

3.

AlphaFold2 reveals commonalities and novelties in protein structure space for 21 model organisms.

Bordin, Nicola; Sillitoe, Ian; Nallapareddy, Vamsi; Rauer, Clemens; Lam, Su Datt; Waman, Vaishali P; Sen, Neeladri; Heinzinger, Michael; Littmann, Maria; Kim, Stephanie; Velankar, Sameer; Steinegger, Martin; Rost, Burkhard; Orengo, Christine.

Commun Biol ; 6(1): 160, 2023 02 08.

Article in English | MEDLINE | ID: mdl-36755055

ABSTRACT

Deep-learning (DL) methods like DeepMind's AlphaFold2 (AF2) have led to substantial improvements in protein structure prediction. We analyse confident AF2 models from 21 model organisms using a new classification protocol (CATH-Assign) which exploits novel DL methods for structural comparison and classification. Of ~370,000 confident models, 92% can be assigned to 3253 superfamilies in our CATH domain superfamily classification. The remaining models cluster into 2367 putative novel superfamilies. Detailed manual analysis of 618 of these, having at least one human relative, reveals extremely remote homologies and further unusual features. Only 25 novel superfamilies could be confirmed. Although most models map to existing superfamilies, AF2 domains expand CATH by 67%, increase the number of unique 'global' folds by 36% and will provide valuable insights on structure-function relationships. CATH-Assign will harness the huge expansion in structural data provided by DeepMind to rationalise evolutionary changes driving functional divergence.

Subject(s)

Furylfuramide , Proteins , Humans , Protein Databases , Proteins/chemistry

4.

CATHe: detection of remote homologues for CATH superfamilies using embeddings from protein language models.

Nallapareddy, Vamsi; Bordin, Nicola; Sillitoe, Ian; Heinzinger, Michael; Littmann, Maria; Waman, Vaishali P; Sen, Neeladri; Rost, Burkhard; Orengo, Christine.

Bioinformatics ; 39(1)2023 01 01.

Article in English | MEDLINE | ID: mdl-36648327

ABSTRACT

MOTIVATION: CATH is a protein domain classification resource that exploits an automated workflow of structure and sequence comparison alongside expert manual curation to construct a hierarchical classification of evolutionary and structural relationships. The aim of this study was to develop algorithms for detecting remote homologues missed by state-of-the-art hidden Markov model (HMM)-based approaches. The method developed (CATHe) combines a neural network with sequence representations obtained from protein language models. It was assessed using a dataset of remote homologues having less than 20% sequence identity to any domain in the training set. RESULTS: The CATHe models trained on the 1773 largest and the 50 largest CATH superfamilies had an accuracy of 85.6 ± 0.4% and 98.2 ± 0.3%, respectively. As a further test of the power of CATHe to detect more remote homologues missed by HMMs derived from CATH domains, we used a dataset consisting of protein domains that had annotations in Pfam, but not in CATH. By using highly reliable CATHe predictions (expected error rate <0.5%), we were able to provide CATH annotations for 4.62 million Pfam domains. For a subset of these domains from Homo sapiens, we structurally validated 90.86% of the predictions by comparing their corresponding AlphaFold2 structures with structures from the CATH superfamilies to which they were assigned. AVAILABILITY AND IMPLEMENTATION: The code for the developed models is available at https://github.com/vam-sin/CATHe, and the datasets developed in this study can be accessed at https://zenodo.org/record/6327572. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Algorithms , Proteins , Humans , Amino Acid Sequence Homology , Proteins/chemistry , Protein Databases

5.

Structural and energetic analyses of SARS-CoV-2 N-terminal domain characterise sugar binding pockets and suggest putative impacts of variants on COVID-19 transmission.

Lam, Su Datt; Waman, Vaishali P; Fraternali, Franca; Orengo, Christine; Lees, Jonathan.

Comput Struct Biotechnol J ; 20: 6302-6316, 2022.

Article in English | MEDLINE | ID: mdl-36408455

ABSTRACT

Coronavirus disease 2019 (COVID-19) caused by SARS-CoV-2 is an ongoing pandemic that causes a significant health/socioeconomic burden. Variants of concern (VOCs) have emerged affecting transmissibility, disease severity and re-infection risk. Studies suggest that the N-terminal domain (NTD) of the spike protein may have a role in facilitating virus entry via sialic-acid receptor binding. Furthermore, most VOCs include novel NTD variants. Despite global sequence and structure similarity, most sialic-acid binding pockets in NTD vary across coronaviruses. Our work suggests ongoing evolutionary tuning of the sugar-binding pockets and recent analyses have shown that NTD insertions in VOCs tend to lie close to loops. We extended the structural characterisation of these sugar-binding pockets and explored whether variants could enhance sialic acid-binding. We found that recent NTD insertions in VOCs (i.e., Gamma, Delta and Omicron variants) and emerging variants of interest (VOIs) (i.e., Iota, Lambda and Theta variants) frequently lie close to sugar-binding pockets. For some variants, including the recent Omicron VOC, we find increases in predicted sialic acid-binding energy, compared to the original SARS-CoV-2, which may contribute to increased transmission. These binding observations are supported by molecular dynamics (MD) simulations. We examined the similarity of NTD across Betacoronaviruses to determine whether the sugar-binding pockets are sufficiently similar to be exploited in drug design. Whilst most pockets are too structurally variable, we detected a previously unknown highly structurally conserved pocket which can be investigated in pursuit of a generic pan-Betacoronavirus drug. Our structure-based analyses help rationalise the effects of VOCs and provide hypotheses for experiments. Our findings suggest a strong need for experimental monitoring of changes in NTD of VOCs.

6.

Three-dimensional Structure Databases of Biological Macromolecules.

Waman, Vaishali P; Orengo, Christine; Kleywegt, Gerard J; Lesk, Arthur M.

Methods Mol Biol ; 2449: 43-91, 2022.

Article in English | MEDLINE | ID: mdl-35507259

ABSTRACT

Databases of three-dimensional structures of proteins (and their associated molecules) provide: (a) Curated repositories of coordinates of experimentally determined structures, including extensive metadata; for instance information about provenance, details about data collection and interpretation, and validation of results. (b) Information-retrieval tools to allow searching to identify entries of interest and provide access to them. (c) Links among databases, especially to databases of amino-acid and genetic sequences, and of protein function; and links to software for analysis of amino-acid sequence and protein structure, and for structure prediction. (d) Collections of predicted three-dimensional structures of proteins. These will become more and more important after the breakthrough in structure prediction achieved by AlphaFold2. The single global archive of experimentally determined biomacromolecular structures is the Protein Data Bank (PDB). It is managed by wwPDB, a consortium of five partner institutions: the Protein Data Bank in Europe (PDBe), the Research Collaboratory for Structural Bioinformatics (RCSB), the Protein Data Bank Japan (PDBj), the BioMagResBank (BMRB), and the Electron Microscopy Data Bank (EMDB). In addition to jointly managing the PDB repository, the individual wwPDB partners offer many tools for analysis of protein and nucleic acid structures and their complexes, including providing computer-graphic representations. Their collective and individual websites serve as hubs of the community of structural biologists, offering newsletters, reports from Task Forces, training courses, and "helpdesks," as well as links to external software. Many specialized projects are based on the information contained in the PDB. Especially important are SCOP, CATH, and ECOD, which present classifications of protein domains.

Subject(s)

Proteins , Software , Computational Biology , Protein Databases , Protein Conformation , Proteins/chemistry

7.

Computational approaches to predict protein functional families and functional sites.

Rauer, Clemens; Sen, Neeladri; Waman, Vaishali P; Abbasian, Mahnaz; Orengo, Christine A.

Curr Opin Struct Biol ; 70: 108-122, 2021 10.

Article in English | MEDLINE | ID: mdl-34225010

ABSTRACT

Understanding the mechanisms of protein function is indispensable for many biological applications, such as protein engineering and drug design. However, experimental annotations are sparse, and therefore, theoretical strategies are needed to fill the gap. Here, we present the latest developments in building functional subclassifications of protein superfamilies and using evolutionary conservation to detect functional determinants, for example, catalytic-, binding- and specificity-determining residues important for delineating the functional families. We also briefly review other features exploited for functional site detection and new machine learning strategies for combining multiple features.

Subject(s)

Biological Evolution , Proteins , Binding Sites , Catalysis , Computational Biology , Humans , Machine Learning , Protein Engineering , Proteins/genetics

8.

The impact of structural bioinformatics tools and resources on SARS-CoV-2 research and therapeutic strategies.

Waman, Vaishali P; Sen, Neeladri; Varadi, Mihaly; Daina, Antoine; Wodak, Shoshana J; Zoete, Vincent; Velankar, Sameer; Orengo, Christine.

Brief Bioinform ; 22(2): 742-768, 2021 03 22.

Article in English | MEDLINE | ID: mdl-33348379

ABSTRACT

SARS-CoV-2 is the causative agent of COVID-19, the ongoing global pandemic. It has posed a worldwide challenge to human health as no effective treatment is currently available to combat the disease. Its severity has led to unprecedented collaborative initiatives for therapeutic solutions against COVID-19. Studies resorting to structure-based drug design for COVID-19 are plethoric and show good promise. Structural biology provides key insights into 3D structures, critical residues/mutations in SARS-CoV-2 proteins, implicated in infectivity, molecular recognition and susceptibility to a broad range of host species. The detailed understanding of viral proteins and their complexes with host receptors and candidate epitope/lead compounds is the key to developing a structure-guided therapeutic design. Since the discovery of SARS-CoV-2, several structures of its proteins have been determined experimentally at an unprecedented speed and deposited in the Protein Data Bank. Further, specialized structural bioinformatics tools and resources have been developed for theoretical models, data on protein dynamics from computer simulations, impact of variants/mutations and molecular therapeutics. Here, we provide an overview of ongoing efforts on developing structural bioinformatics tools and resources for COVID-19 research. We also discuss the impact of these resources and structure-based studies, to understand various aspects of SARS-CoV-2 infection and therapeutic development. These include (i) understanding differences between SARS-CoV-2 and SARS-CoV, leading to increased infectivity of SARS-CoV-2, (ii) deciphering key residues in the SARS-CoV-2 involved in receptor-antibody recognition, (iii) analysis of variants in host proteins that affect host susceptibility to infection and (iv) analyses facilitating structure-based drug and vaccine design against SARS-CoV-2.

Subject(s)

Antiviral Agents/therapeutic use , COVID-19 Drug Treatment , Computational Biology , SARS-CoV-2/isolation & purification , COVID-19/virology , Humans , Protein Conformation , Viral Proteins/chemistry

9.

CATH: increased structural coverage of functional space.

Sillitoe, Ian; Bordin, Nicola; Dawson, Natalie; Waman, Vaishali P; Ashford, Paul; Scholes, Harry M; Pang, Camilla S M; Woodridge, Laurel; Rauer, Clemens; Sen, Neeladri; Abbasian, Mahnaz; Le Cornu, Sean; Lam, Su Datt; Berka, Karel; Varekova, Ivana Hutarová; Svobodova, Radka; Lees, Jon; Orengo, Christine A.

Nucleic Acids Res ; 49(D1): D266-D273, 2021 01 08.

Article in English | MEDLINE | ID: mdl-33237325

ABSTRACT

CATH (https://www.cathdb.info) identifies domains in protein structures from wwPDB and classifies these into evolutionary superfamilies, thereby providing structural and functional annotations. There are two levels: CATH-B, a daily snapshot of the latest domain structures and superfamily assignments, and CATH+, with additional derived data, such as predicted sequence domains, and functionally coherent sequence subsets (Functional Families or FunFams). The latest CATH+ release, version 4.3, significantly increases coverage of structural and sequence data, with an addition of 65,351 fully classified domain structures (+15%), providing 500,238 structural domains, and 151 million predicted sequence domains (+59%) assigned to 5481 superfamilies. The FunFam generation pipeline has been re-engineered to cope with the increased influx of data. Three times more sequences are captured in FunFams, with a concomitant increase in functional purity, information content and structural coverage. FunFam expansion increases the structural annotations provided for experimental GO terms (+59%). We also present CATH-FunVar web-pages displaying variations in protein sequences and their proximity to known or predicted functional sites. We present two case studies: (1) putative cancer drivers and (2) SARS-CoV-2 proteins. Finally, we have improved links to and from CATH including SCOP, InterPro, Aquaria and 2DProt.

Subject(s)

Computational Biology/statistics & numerical data , Protein Databases/statistics & numerical data , Protein Domains , Proteins/chemistry , Amino Acid Sequence , COVID-19/epidemiology , COVID-19/prevention & control , COVID-19/virology , Computational Biology/methods , Epidemics , Humans , Internet , Molecular Sequence Annotation , Proteins/genetics , Proteins/metabolism , SARS-CoV-2/genetics , SARS-CoV-2/metabolism , SARS-CoV-2/physiology , Protein Sequence Analysis/methods , Amino Acid Sequence Homology , Viral Proteins/chemistry , Viral Proteins/genetics , Viral Proteins/metabolism

10.

HARP: a database of structural impacts of systematic missense mutations in drug targets of Mycobacterium leprae.

Vedithi, Sundeep Chaitanya; Malhotra, Sony; Skwark, Marcin J; Munir, Asma; Acebrón-García-De-Eulate, Marta; Waman, Vaishali P; Alsulami, Ali; Ascher, David B; Blundell, Tom L.

Comput Struct Biotechnol J ; 18: 3692-3704, 2020.

Article in English | MEDLINE | ID: mdl-33304465

ABSTRACT

Computational Saturation Mutagenesis is an in-silico approach that employs systematic mutagenesis of each amino acid residue in the protein to all other amino acid types, and predicts changes in thermodynamic stability and affinity to the other subunits/protein counterparts, ligands and nucleic acid molecules. The data thus generated are useful in understanding the functional consequences of mutations in antimicrobial resistance phenotypes. In this study, we applied computational saturation mutagenesis to three important drug targets in Mycobacterium leprae (M. leprae) for the drugs dapsone, rifampin and ofloxacin, namely Dihydropteroate Synthase (DHPS), RNA Polymerase (RNAP) and DNA Gyrase (GYR), respectively. M. leprae causes leprosy and is an obligate intracellular bacillus with limited protein structural information associating mutations with phenotypic resistance outcomes in leprosy. Experimentally solved structures of DHPS, RNAP and GYR of M. leprae are not available in the Protein Data Bank; therefore, we modelled the structures of these proteins using template-based comparative modelling and introduced systematic mutations in each model, generating 80,902 mutations and mutant structures for all three proteins. Impacts of mutations on stability and protein-subunit, protein-ligand and protein-nucleic acid affinities were computed using various in-house developed and other published protein stability and affinity prediction software. A consensus impact was estimated for each mutation using qualitative scoring metrics for physicochemical properties and by a categorical grouping of stability and affinity predictions. We developed a web database named HARP (a database of Hansen's Disease Antimicrobial Resistance Profiles), which is accessible at https://harp-leprosy.org and provides the details of each of these predictions.
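
The enumeration step of computational saturation mutagenesis described in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the HARP pipeline: the function name `saturation_mutations`, the 20-letter standard alphabet and the `M1A`-style labels are assumptions made here, and the stability and affinity predictions mentioned in the abstract are not modelled.

```python
# Hypothetical sketch: list every single-residue substitution of a protein.
# The 20 standard amino acids are assumed; the abstract does not give the alphabet.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def saturation_mutations(seq):
    """Yield each substitution as wild-type + 1-based position + mutant, e.g. 'M1A'."""
    for pos, wild in enumerate(seq, start=1):
        for mutant in AMINO_ACIDS:
            if mutant != wild:  # skip the silent "mutation" to the same residue
                yield f"{wild}{pos}{mutant}"


if __name__ == "__main__":
    muts = list(saturation_mutations("MK"))
    print(len(muts), muts[:3])
```

Under these assumptions a sequence of N standard residues yields 19 × N substitutions, each of which would then be passed to whatever stability or affinity predictor is in use.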

11.

The Genome3D Consortium for Structural Annotations of Selected Model Organisms.

Waman, Vaishali P; Blundell, Tom L; Buchan, Daniel W A; Gough, Julian; Jones, David; Kelley, Lawrence; Murzin, Alexey; Pandurangan, Arun Prasad; Sillitoe, Ian; Sternberg, Michael; Torres, Pedro; Orengo, Christine.

Methods Mol Biol ; 2165: 27-67, 2020.

Article in English | MEDLINE | ID: mdl-32621218

ABSTRACT

The Genome3D consortium is a collaborative project involving protein structure prediction and annotation resources developed by six world-leading structural bioinformatics groups based in the United Kingdom (namely Blundell, Murzin, Gough, Sternberg, Orengo, and Jones). The main objective of Genome3D is to serve as a common portal providing both predicted models and annotations of proteins in model organisms, using several resources developed by these labs such as CATH-Gene3D, DOMSERF, pDomTHREADER, PHYRE, SUPERFAMILY, FUGUE/TOCATTA, and VIVACE. These resources primarily use SCOP- and/or CATH-based protein domain assignments. Another objective of Genome3D is to compare structural classifications of protein domains in the CATH and SCOP databases and to provide a consensus mapping of CATH and SCOP protein superfamilies. CATH/SCOP mapping analyses led to the identification of a total of 1429 consensus superfamilies. Currently, Genome3D provides structural annotations for ten model organisms, including Homo sapiens, Arabidopsis thaliana, Mus musculus, Escherichia coli, Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Plasmodium falciparum, Staphylococcus aureus, and Schizosaccharomyces pombe. Thus, Genome3D serves as a common gateway to each structure prediction/annotation resource and allows users to perform comparative assessment of the predictions. It thus helps researchers broaden their perspective on structure/function predictions of their query protein of interest in selected model organisms.

Subject(s)

Genomics/organization & administration , Knowledge Bases , Molecular Sequence Annotation/methods , Proteome/chemistry , Animals , Arabidopsis , Genome , Genomics/methods , Humans , Information Dissemination , Sequence Alignment/methods , United Kingdom , Yeasts

12.

Mycobacterial genomics and structural bioinformatics: opportunities and challenges in drug discovery.

Waman, Vaishali P; Vedithi, Sundeep Chaitanya; Thomas, Sherine E; Bannerman, Bridget P; Munir, Asma; Skwark, Marcin J; Malhotra, Sony; Blundell, Tom L.

Emerg Microbes Infect ; 8(1): 109-118, 2019.

Article in English | MEDLINE | ID: mdl-30866765

ABSTRACT

Of the more than 190 distinct species of the Mycobacterium genus, many are economically and clinically important pathogens of humans or animals. Among those mycobacteria that infect humans, three species, namely Mycobacterium tuberculosis (causative agent of tuberculosis), Mycobacterium leprae (causative agent of leprosy) and Mycobacterium abscessus (causative agent of chronic pulmonary infections), pose a concern to global public health. Although antibiotics have been successfully developed to combat each of these, the emergence of drug-resistant strains is an increasing challenge for treatment and drug discovery. Here we describe the impact of the rapid expansion of genome sequencing and genome/pathway annotations that have greatly improved the progress of structure-guided drug discovery. We focus on the applications of comparative genomics, metabolomics, evolutionary bioinformatics and structural proteomics to identify potential drug targets. The opportunities and challenges for the design of drugs for M. tuberculosis, M. leprae and M. abscessus to combat resistance are discussed.

Subject(s)

Bacterial Proteins/chemistry , Computational Biology/methods , Mycobacterium/genetics , DNA Sequence Analysis/methods , Animals , Bacterial Proteins/metabolism , Drug Discovery , Bacterial Drug Resistance , Bacterial Genome , Humans , Molecular Sequence Annotation , Mycobacterium/metabolism , Mycobacterium abscessus/genetics , Mycobacterium abscessus/metabolism , Mycobacterium leprae/genetics , Mycobacterium leprae/metabolism , Mycobacterium tuberculosis/genetics , Mycobacterium tuberculosis/metabolism , Protein Conformation , Proteomics

13.

Genetic diversity and evolution of dengue virus serotype 3: A comparative genomics study.

Waman, Vaishali P; Kale, Mohan M; Kulkarni-Kale, Urmila.

Infect Genet Evol ; 49: 234-240, 2017 04.

Article in English | MEDLINE | ID: mdl-28126562

ABSTRACT

Dengue virus serotype 3 (DENV-3), one of the four serotypes of Dengue viruses, is geographically diverse. There are five distinct genotypes (I-V) of DENV-3. Emerging strains and lineages of DENV-3 are increasingly being reported. Availability of genomic data for DENV-3 strains provides an opportunity to study its population structure. Complete genome sequences are available for 860 strains of four genotypes (I, II, III and V) isolated worldwide and were analyzed using population genetics and evolutionary approaches to map the landscape of genomic diversity. The DENV-3 population is observed to be stratified into five major subpopulations. Genotypes I and II formed independent subpopulations, while genotype III is subdivided into three subpopulations (GIII-a, GIII-b and GIII-c) and is therefore heterogeneous. The genotype I, II and GIII-a subpopulations comprise Asian strains, whereas GIII-c comprises American strains. The GIII-b subpopulation consists mainly of American strains along with a few strains from Sri Lanka. Genetic admixture is predominantly observed in Sri Lankan strains of genotype III and all strains of genotype V. Inter-genotype recombination was observed to occur in the non-structural region of several Asian strains, whereas the extent of recombination was limited in American strains. Significant positive selection was found to be operational on all genes and observed to be the main driving force of genetic diversity. Positive selection was strongly operational on the branches leading to Asian genotypes and helped to delineate the genetic differences between Asian and American lineages. Thus, inter-genotype recombination, migration and adaptive evolution are the major determinants of evolution of DENV-3.

Subject(s)

Dengue Virus/genetics , Dengue/epidemiology , Viral Genome , Genotype , Phylogeny , Serogroup , Asia/epidemiology , Biological Evolution , Dengue/virology , Dengue Virus/classification , Dengue Virus/isolation & purification , Genetic Variation , Humans , Molecular Epidemiology , North America/epidemiology , Phylogeography , Genetic Recombination , Genetic Selection , South America/epidemiology

14.

Analysis of genotype diversity and evolution of Dengue virus serotype 2 using complete genomes.

Waman, Vaishali P; Kolekar, Pandurang; Ramtirthkar, Mukund R; Kale, Mohan M; Kulkarni-Kale, Urmila.

PeerJ ; 4: e2326, 2016.

Article in English | MEDLINE | ID: mdl-27635316

ABSTRACT

BACKGROUND: Dengue is one of the most common arboviral diseases prevalent worldwide and is caused by Dengue viruses (genus Flavivirus, family Flaviviridae). There are four serotypes of Dengue Virus (DENV-1 to DENV-4), each of which is further subdivided into distinct genotypes. DENV-2 is frequently associated with severe dengue infections and epidemics. DENV-2 consists of six genotypes: Asian/American, Asian I, Asian II, Cosmopolitan, American and sylvatic. A comparative genomic study was carried out to infer the population structure of DENV-2 and to analyze the role of evolutionary and spatiotemporal factors in the emergence of diversifying lineages. METHODS: Complete genome sequences of 990 strains of DENV-2 were analyzed using Bayesian-based population genetics and phylogenetic approaches to infer genetically distinct lineages. The role of spatiotemporal factors, genetic recombination and selection pressure in the evolution of DENV-2 was examined using sequence-based bioinformatics approaches. RESULTS: DENV-2 genetic structure is complex and consists of fifteen subpopulations/lineages. The Asian/American genotype is observed to be diversified into seven lineages. The Asian I, Cosmopolitan and sylvatic genotypes were each found to be subdivided into two lineages. The populations of American and Asian II genotypes were observed to be homogeneous. Significant evidence of episodic positive selection was observed in all the genes, except NS4A. Positive selection operational on a few codons in the envelope gene confers antigenic and lineage diversity in the American strains of the Asian/American genotype. Selection on codons of non-structural genes was observed to impact diversification of lineages in the Asian I, Cosmopolitan and sylvatic genotypes. Evidence of intra/inter-genotype recombination was obtained and the uncertainty in classification of recombinant strains was resolved using the population genetics approach. DISCUSSION: Complete genome-based analysis revealed that the worldwide population of DENV-2 strains is subdivided into fifteen lineages. The population structure of DENV-2 is spatiotemporal and is shaped by episodic positive selection and recombination. Intra-genotype diversity was observed in four genotypes (Asian/American, Asian I, Cosmopolitan and sylvatic). Episodic positive selection on envelope and non-structural genes translates into antigenic diversity and appears to be responsible for the emergence of strains/lineages in DENV-2 genotypes. Understanding of the genotype diversity and emerging lineages will be useful to design strategies for epidemiological surveillance and vaccine design.

15.

Population genomics of dengue virus serotype 4: insights into genetic structure and evolution.

Waman, Vaishali P; Kasibhatla, Sunitha Manjari; Kale, Mohan M; Kulkarni-Kale, Urmila.

Arch Virol ; 161(8): 2133-48, 2016 Aug.

Article in English | MEDLINE | ID: mdl-27169727

ABSTRACT

The spread of dengue disease has become a global public health concern. Dengue is caused by dengue virus, which is a mosquito-borne arbovirus of the genus Flavivirus, family Flaviviridae. There are four dengue virus serotypes (1-4), each of which is known to trigger mild to severe disease. Dengue virus serotype 4 (DENV-4) has four genotypes and is increasingly being reported to be re-emerging in various parts of the world. Therefore, the population structure and factors shaping the evolution of DENV-4 strains across the world were studied using genome-based population genetic, phylogenetic and selection pressure analysis methods. The population genomics study helped to reveal the spatiotemporal structure of the DENV-4 population and its primary division into two spatially distinct clusters: American and Asian. These spatial clusters show further time-dependent subdivisions within genotypes I and II. Thus, the DENV-4 population is observed to be stratified into eight genetically distinct lineages, two of which are formed by American strains and six of which are formed by Asian strains. Episodic positive selection was observed in the structural (E) and non-structural (NS2A and NS3) genes, which appears to be responsible for diversification of Asian lineages in general and that of modern lineages of genotype I and II in particular. In summary, the global DENV-4 population is stratified into eight genetically distinct lineages, in a spatiotemporal manner with limited recombination. The significant role of adaptive evolution in causing diversification of DENV-4 lineages is discussed. The evolution of DENV-4 appears to be governed by interplay between spatiotemporal distribution, episodic positive selection and intra/inter-genotype recombination.

Subject(s)

Dengue Virus/genetics , Dengue/virology , Molecular Evolution , Viral Genome , Dengue Virus/classification , Dengue Virus/isolation & purification , Genetic Variation , Genomics , Genotype , Humans , Phylogeny , Viral Proteins/genetics , Viral Proteins/metabolism

16.

RV-Typer: A Web Server for Typing of Rhinoviruses Using Alignment-Free Approach.

Kolekar, Pandurang S; Waman, Vaishali P; Kale, Mohan M; Kulkarni-Kale, Urmila.

PLoS One ; 11(2): e0149350, 2016.

Article in English | MEDLINE | ID: mdl-26870949

ABSTRACT

Rhinoviruses (RV) are increasingly being reported to cause mild to severe infections of respiratory tract in humans. RV are antigenically the most diverse species of the genus Enterovirus and family Picornaviridae. There are three species of RV (RV-A, -B and -C), with 80, 32 and 55 serotypes/types, respectively. Antigenic variation is the main limiting factor for development of a cross-protective vaccine against RV. Serotyping of Rhinoviruses is carried out using cross-neutralization assays in cell culture. However, these assays become laborious and time-consuming for the large number of strains. Alternatively, serotyping of RV is carried out by alignment-based phylogeny of both protein and nucleotide sequences of VP1. However, serotyping of RV based on alignment-based phylogeny is a multi-step process, which needs to be repeated every time a new isolate is sequenced. In view of the growing need for serotyping of RV, an alignment-free method based on "return time distribution" (RTD) of amino acid residues in VP1 protein has been developed and implemented in the form of a web server titled RV-Typer. RV-Typer accepts nucleotide or protein sequences as an input and computes return times of di-peptides (k = 2) to assign serotypes. The RV-Typer performs with 100% sensitivity and specificity. It is significantly faster than alignment-based methods. The web server is available at http://bioinfo.net.in/RV-Typer/home.html.

Subject(s)

Phylogeny , Picornaviridae Infections/virology , Rhinovirus/classification , Rhinovirus/genetics , Serotyping/methods , Capsid Proteins/genetics , Viral Genes , Humans , Internet , Software

17.

Population structure and evolution of Rhinoviruses.

Waman, Vaishali P; Kolekar, Pandurang S; Kale, Mohan M; Kulkarni-Kale, Urmila.

PLoS One ; 9(2): e88981, 2014.

Article in English | MEDLINE | ID: mdl-24586469

ABSTRACT

Rhinoviruses, formerly known as Human rhinoviruses, are the most common cause of air-borne upper respiratory tract infections in humans. Rhinoviruses belong to the family Picornaviridae and are divided into three species namely, Rhinovirus A, -B and -C, which are antigenically diverse. Genetic recombination is found to be one of the important causes for diversification of Rhinovirus species. Although emerging lineages within Rhinoviruses have been reported, their population structure has not been studied yet. The availability of complete genome sequences facilitates study of population structure, genetic diversity and underlying evolutionary forces, such as mutation, recombination and selection pressure. Analysis of complete genomes of Rhinoviruses using a model-based population genetics approach provided strong evidence for existence of seven genetically distinct subpopulations. As a result of diversification, Rhinovirus A and -C populations are divided into four and two subpopulations, respectively. Genetically, the Rhinovirus B population was found to be homogeneous. Intra-species recombination was observed to be prominent in Rhinovirus A and -C species. Significant evidence of episodic positive selection was obtained for several sites within coding sequences of structural and non-structural proteins. This corroborates well with known phenotypic properties such as antigenicity of structural proteins. Episodic positive selection appears to be responsible for emergence of new lineages especially in Rhinovirus A. In summary, the Rhinovirus population is an ensemble of seven distinct lineages. In case of Rhinovirus A, intra-species recombination and episodic positive selection contribute to its further diversification. In case of Rhinovirus C, intra- and inter-species recombinations are responsible for observed diversity. Population genetics approach was further useful to analyze phylogenetic tree topologies pertaining to recombinant strains, especially when trees are derived using complete genomes. Understanding of population structure serves as a foundation for designing new vaccines and drugs as well as to explain emergence of drug resistance amongst subpopulations.

Subject(s)

Molecular Evolution , Genetic Variation , Rhinovirus/genetics , Genetic Linkage , Viral Genome/genetics , Humans , Phylogeny , Viral RNA/genetics , Genetic Recombination , Respiratory Tract Infections/virology , Rhinovirus/classification , RNA Sequence Analysis
